44 research outputs found

    Network and multi-scale signal analysis for the integration of large omic datasets: applications in Populus trichocarpa

    Poplar species are promising sources of cellulosic biomass for biofuels because of their fast growth rate, high cellulose content and moderate lignin content. There is a growing movement toward integrating multiple layers of ’omics data in a systems biology approach to understand gene-phenotype relationships and assist plant breeding programs. This dissertation uses network and signal processing techniques for the combined analysis of these data types, with the goals of (1) increasing fundamental knowledge of P. trichocarpa and (2) facilitating the generation of hypotheses about target genes and phenotypes of interest. A data integration “Lines of Evidence” method is presented for the identification and prioritization of target genes involved in functions of interest. A new post-GWAS method, Pleiotropy Decomposition, is presented, which extracts pleiotropic relationships between genes and phenotypes from GWAS results, allowing for identification of genes with signatures favorable to genome editing. Continuous wavelet transform signal processing analysis is applied to characterize the genome distributions of various features (including variant density, gene density, and methylation profiles) in order to identify chromosome structures such as the centromere. This yielded approximate centromere locations on all P. trichocarpa chromosomes, which had previously not been adequately reported in the scientific literature. Discrete wavelet transform signal processing was applied to genomic features from various data types, including transposable element density, methylation density, SNP density, gene density, centromere position and putative ancestral centromere position. Subsequent correlation analysis of the resulting wavelet coefficients identified scale-specific relationships between these genomic features and provides insights into the evolution of the genome structure of P. trichocarpa. These methods have provided strategies both to increase fundamental knowledge about the P. trichocarpa system and to identify new target genes related to biofuels targets. We intend that these approaches will ultimately be used to design better plants for more efficient and sustainable production of bioenergy.
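
    The scale-specific correlation step mentioned in this abstract can be illustrated with a small sketch. It is not the dissertation's implementation: the Haar wavelet, the function names, and the requirement that the signal length be a power of two are all assumptions made here for brevity. The idea is to decompose two genomic feature tracks (for example, binned gene density and transposable element density) into detail coefficients at each scale, then correlate the coefficients level by level.

```python
import math


def haar_details(signal):
    """Return Haar DWT detail coefficients, finest scale first.

    Assumption: len(signal) is a power of two (no padding is done here).
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def pearson(x, y):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def scale_correlations(track_a, track_b, min_coeffs=3):
    """Correlate two tracks' detail coefficients at each scale.

    Returns {level: r}, where level 1 is the finest scale. Levels with
    fewer than min_coeffs coefficients are skipped because a correlation
    over one or two points is not informative.
    """
    result = {}
    levels = zip(haar_details(track_a), haar_details(track_b))
    for level, (da, db) in enumerate(levels, start=1):
        if len(da) >= min_coeffs:
            result[level] = pearson(da, db)
    return result
```

    With real data, each track would be a per-bin density along one chromosome; a strong correlation at coarse levels but not at fine ones would suggest the two features co-vary only over long genomic distances.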

    Linking crop traits to transcriptome differences in a progeny population of tetraploid potato

    Background: Potato is the third most consumed crop in the world. Breeding for traits such as yield, product quality and pathogen resistance is a main priority. Identifying molecular signatures of these and other key traits is important for future breeding efforts. In this study, a progeny population from a cross between a breeding line, SW93-1015, and a cultivar, Desiree, was studied by trait analysis and RNA-seq in order to develop an understanding of segregating traits at the molecular level and to identify transcripts whose expression correlates with these traits. Transcript markers with predictive value for field performance that are applicable under controlled environments would be of great value for plant breeding. Results: A total of 34 progeny lines from SW93-1015 and Desiree were phenotyped for 17 different traits in a field under Nordic climate conditions and in controlled climate settings. A master transcriptome was constructed from all 34 progeny lines and the parents through a de novo assembly of RNA-seq reads. Gene expression data obtained in a controlled environment from the 34 lines were correlated to traits by different similarity indices, including Pearson and Spearman, as well as DUO, which calculates the co-occurrence between high and low values for gene expression and trait. Our study linked transcripts to traits such as yield, growth rate, high laying tubers, late and tuber blight, tuber greening and early flowering. We found several transcripts associated with late blight resistance, and transcripts encoding receptors were associated with Dickeya solani susceptibility. Transcript levels of a UBX-domain protein were negatively associated with yield, and a GLABRA2 expression modulator was negatively associated with growth rate. Conclusion: In our study, we identify hundreds of transcripts putatively linked, based on expression, with 17 traits of potato, representing both well-known and novel associations. This approach can be used to link the transcriptome to traits. We explore the possibility of associating the level of transcript expression from controlled, optimal environments with traits in a progeny population using different methods, introducing the application of DUO to transcriptome data for the first time. We verify the expression pattern for five of the putative transcript markers in another progeny population.
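
    As a rough illustration of two of the similarity indices named in this abstract, the sketch below computes Pearson and Spearman correlations between one transcript's expression values and one trait across progeny lines. The function names and the toy numbers are assumptions made here; DUO is not sketched because the abstract does not give its formula.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five progeny lines (not data from the study).
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
trait = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(expression, trait), spearman(expression, trait))
```

    In the toy data the trait rises monotonically but not linearly with expression, so the rank-based Spearman index is higher than the Pearson index; that difference is one reason a study might report several indices side by side.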

    Fungal-Bacterial Networks in the Populus Rhizobiome Are Impacted by Soil Properties and Host Genotype

    Plant root-associated microbial symbionts comprise the plant rhizobiome. These microbes function in provisioning nutrients and water to their hosts, impacting plant health and disease. The plant microbiome is shaped by plant species, plant genotype, soil and environmental conditions, but the contributions of these variables are hard to disentangle from each other in natural systems. We used bioassay common garden experiments to decouple plant genotype and soil property impacts on fungal and bacterial community structure in the Populus rhizobiome. High throughput amplification and sequencing of 16S, ITS, 28S and 18S rDNA was accomplished through 454 pyrosequencing. Co-association patterns of fungal and bacterial taxa were assessed with 16S and ITS datasets. Community bipartite fungal-bacterial networks and PERMANOVA results attribute significant differences in fungal or bacterial communities to soil origin, soil chemical properties and plant genotype. Indicator species analysis identified a common set of root bacteria as well as endophytic and ectomycorrhizal fungi associated with Populus in different soils. However, no single taxon, or consortium of microbes, was indicative of a particular Populus genotype. Fungal-bacterial networks were over-represented in arbuscular mycorrhizal, endophytic, and ectomycorrhizal fungi, as well as bacteria belonging to the orders Rhizobiales, Chitinophagales, Cytophagales, and Burkholderiales. These results demonstrate the importance of soil and plant genotype on fungal-bacterial networks in the belowground plant microbiome.

    High Throughput Screening Technologies in Biomass Characterization

    Biomass analysis is a slow and tedious process, and not solely because of the long generation time of most plant species. Screening large numbers of plant variants for various geno-, pheno-, and chemo-types, whether naturally occurring or engineered in the lab, poses multiple challenges. Plant cell walls are complex, heterogeneous networks that are difficult to deconstruct and analyze. Macroheterogeneity from tissue types, age, and environmental factors makes representative sampling a challenge, and natural variability generates a significant range in data. Using high throughput (HTP) methodologies allows large sample sets and replicates to be examined, narrowing in on more precise data for various analyses. This review provides a comprehensive survey of high throughput screening as applied to biomass characterization, from compositional analysis of cell walls by NIR, NMR, mass spectrometry, and wet chemistry to functional screening of changes in recalcitrance via HTP thermochemical pretreatment coupled to enzyme hydrolysis and microscale fermentation. The advancement and development of most high-throughput methods have been achieved through the use of state-of-the-art equipment and robotics, rapid detection methods, and reductions in sample size and preparation procedures. The computational analysis of the large amount of data generated using high throughput analytical techniques has recently become more sophisticated, faster and economically viable, enabling a more comprehensive understanding of biomass genomics, structure, composition, and properties. Therefore, methodology for analyzing large datasets generated by the various analytical techniques is also covered.

    Multi-Phenotype Association Decomposition: Unraveling Complex Gene-Phenotype Relationships

    Various patterns of multi-phenotype associations (MPAs) exist in the results of Genome Wide Association Studies (GWAS) involving different topologies of single nucleotide polymorphism (SNP)-phenotype associations. These can provide interesting information about the different impacts of a gene on closely related phenotypes or disparate phenotypes (pleiotropy). In this work we present MPA Decomposition, a new network-based approach which decomposes the results of a multi-phenotype GWAS into three bipartite networks, which, when used together, unravel the multi-phenotype signatures of genes on a genome-wide scale. The decomposition involves the construction of a phenotype powerset space, and subsequent mapping of genes into this new space. Clustering of genes in this powerset space groups genes based on their detailed MPA signatures. We show that this method allows us to find multiple different MPA and pleiotropic signatures within individual genes and to classify and cluster genes based on these SNP-phenotype association topologies. We demonstrate the use of this approach on a GWAS analysis of a large population of 882 Populus trichocarpa genotypes using untargeted metabolomics phenotypes. This method should prove invaluable in the interpretation of large GWAS datasets and aid in future synthetic biology efforts designed to optimize phenotypes of interest.
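
    The powerset mapping described in this abstract can be sketched in a few lines, under one reading of the text that is an assumption here: each SNP is mapped to the set of phenotypes it associates with (one element of the phenotype powerset), and a gene's signature is the collection of such sets across its SNPs. Grouping genes with identical signatures stands in for the clustering step; the function names, the input layout, and the exact-match grouping are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def gene_signatures(associations, snp_to_gene):
    """Map each gene to its set of phenotype subsets.

    associations: iterable of (snp_id, phenotype) pairs from a GWAS.
    snp_to_gene:  dict assigning each SNP to one gene (a simplification).
    """
    snp_phenotypes = defaultdict(set)
    for snp, phenotype in associations:
        snp_phenotypes[snp].add(phenotype)

    signatures = defaultdict(set)
    for snp, phenotypes in snp_phenotypes.items():
        # Each SNP contributes one point in the phenotype powerset.
        signatures[snp_to_gene[snp]].add(frozenset(phenotypes))
    return {gene: frozenset(sig) for gene, sig in signatures.items()}


def group_by_signature(signatures):
    """Group genes whose signatures are identical (exact-match clustering)."""
    groups = defaultdict(list)
    for gene, signature in signatures.items():
        groups[signature].append(gene)
    return sorted(sorted(genes) for genes in groups.values())
```

    Under this reading, a gene with one SNP hitting phenotypes A and B gets a different signature from a gene with two separate SNPs hitting A and B individually, which is the kind of topology distinction the abstract describes.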